1. Introduction

As the number of cases increases globally, governments and authorities have continued to use mobility restrictions that were, and still are, the only effective tool to control for the viral transmission. Yet, the relationship between public orders and behavioral parameters of social distancing observed in the community is a complex process and an important policy question. The evidence shows that adherence to public orders about the social distancing is not stable and fluctuates with degree of spatial differences in information and the level of risk aversion. This study aims to uncover these mobility dynamics and their effect on the COVID-19 spread in three major cities in North America. Without accounting for this dynamic structure, a naive calculation of correlations with any level of lagged mobility variations shows a strong negative relationship: as the mobility goes down, cases go up.

We develop a nonparametric model that captures this dynamic relationship to answer the following two related questions:

  1. When the varying delays in its effect on the spread are identified properly, what would be the overall effect of mobility restrictions?
  2. If there is any effect of mobility on the spread, how long would it take for restrictions to have the most robust positive effect?

Our findings reveal that the relationship between mobility and the spread is highly unstable even over short time intervals and, after controlling for it, the results show that the average correlation between mobility and the COVID-19 spread is as high as 0.77 in Montreal, 0.70 in Toronto, and 0.73 in New York, implying that restrictions might be more effective on curbing the spread in Montreal. However, when the responsiveness of positivity rates to mobility changes is quantified with elasticity calculations, it reverses the order: the COVID-19 spread is more sensitive to changes in mobility in New York than in Toronto and Montreal with elasticities calculated as 0.72, 0.55, and 0.23 respectively. In other words, the mobility restrictions are least capable in fighting with the spread in Montreal than other major cities.

These differences demonstrate that evaluating the efficacy of different local policies against COVID-19 requires a careful analysis: The correlations measure the intensity of relationship between the spread and mobility, and any value for the correlation between PR and mobility is perfectly compatible with a sensitive and insensitive PR.

In addition, we also find that the immediate effect of mobility restrictions on the spread is related to how effective the local contact tracing is when the community spread is rising. Long delays in the effects of mobility restrictions on the spread indicate a very poor contact tracing where the community transmission of COVID-19 is widespread. Although Toronto has a very long delay in the first half of the second wave, it later reduces the delay substantially in the second half. The average delay for seeing the first positive effects of mobility restrictions on the spread is 9.69 days in New York, 9.75 days in Toronto, and 9.39 days in Montreal.

This work analyzes the effect of mobility restrictions on the COVID-19 spread for three major cities in North America: Montreal, Toronto, and New York City. The data on Montreal is from INSPQ and obtained by using an unofficial API that powers the INSPQ’s own dashboard (see: https://github.com/SimonCoulombe/covidtwitterbot). Otherwise the data is not publicly available. A limited version of the data on Toronto is publicly available and obtained from the Ontario government. We used these publicly available data to extract daily frequencies in Toronto. The data on New York is publicly available at New York State’s website. This study aims to capture the dynamic relationship between mobility restrictions and the spread. We use positivity rates that reflect the spread and all_day_bing_tiles_visited_relative_change from Facebook, which measures positive or negative change in movement relative to baseline in Montreal and Toronto.

There are three metrics that we developed to measure the effect of social mobility on the spread: the correlation that reflects the overall effectiveness mobility restrictions, the sensitivity of the spread to mobility restrictions, and the average delay in the effect of these restrictions that reflects how efficient the contact tracing is when the community spread rises.

This preliminary work reveals the codes and data sources with their results at the same time. To the best of our knowledge, it is the first methodological work that can be used to compare different regions in terms of the efficacy of their mobility related local public health policies.

2. Montreal

2.1. Data

library(RCurl)
setwd("~/Dropbox/RNS2/Quebec")

# In every run it will update
URL <- "https://www.inspq.qc.ca/sites/default/files/covid/donnees/covid19-hist.csv"
x <- getURL(URL)
outt <- read.csv(textConnection(x))
saveRDS(outt, file = "quebec_uptodate.rds")

out <- readRDS(file = "quebec0204.rds")
head(out[1:6])
##         ï..Date Regroupement Croisement                           Nom
## 1 Date inconnue      Région      RSS01             Bas-Saint-Laurent
## 2 Date inconnue      Région      RSS02     Saguenay–Lac-Saint-Jean
## 3 Date inconnue      Région      RSS03            Capitale-Nationale
## 4 Date inconnue      Région      RSS04 Mauricie et Centre-du-Québec
## 5 Date inconnue      Région      RSS05                        Estrie
## 6 Date inconnue      Région      RSS06                     Montréal
##   cas_cum_lab_n cas_cum_epi_n
## 1             0             0
## 2             0             0
## 3             0             0
## 4             0             0
## 5             0             0
## 6             0             0

Cleaning the data and getting only Montreal:

# Removing "bad" dates and converting dates
data <- out[-c(1:43), ]
data$date <- as.Date(data$ï..Date, "%Y-%m-%d") 
data <- data[, c(49, 2:48)]
row.names(data) <- 1:nrow(data)

#Changing Montreal's name
ind <- grep("Montr", data$Nom)

# Getting only Montreal
dfmontr <- data[ind, ]
dfmontr_v1 <- data[data$Croisement=="RSS06", ]

Here is the partial dictionary from https://github.com/SimonCoulombe/covidtwitterbot about the features in the data:

cas_cum_tot_n = number of confirmed cases (cumulative) according to declaration date
cas_cum_lab_n = number of laboratory-confirmed cases (cumulative) according to declaration date
cas_cum_epi_n = number of confirmed cases by epidemiological link (cumulative) according to declaration date
cas_quo_tot_n = number of confirmed cases (daily) according to declaration date
cas_quo_lab_n = number of laboratory-confirmed cases (daily) according to declaration date cas_quo_epi_n = number of cases confirmed by epidemiological link (daily) according to declaration date

act_cum_tot_n = number of active cases (today)

ret_cum_tot_n = number of recovered cases (cumulative
ret_quo_tot_n = number of recovered cases (daily)

dec_cum_tot_n = total deaths (cumulative)
dec_cum_chs_n = total deaths in CHSLDs (cumulative)
dec_cum_rpa_n = total deaths in RPA (cumulative)
dec_cum_dom_n = total deaths at home and unknown (cumulative)
dec_cum_aut_n = total deaths in RI and other (cumulative)
dec_quo_tot_n = total deaths (daily)
dec_quo_chs_n = total deaths in CHSLDs (daily)
dec_quo_rpa_n = total deaths in RPA (daily)
dec_quo_dom_n = total deaths at home and unknown (daily)
dec_quo_aut_n = total deaths in IR and other (daily)

hos_quo_tot_n = new hospitalizations (regular + intensive care)
hos_quo_reg_n = new non-intensive care hospitalizations
hos_quo_si_n = new intensive care hospitalizations

psi_cum_test_pos = cumulative confirmed cases
psi_cum_test_inf = disabled people (negative test) cumulative
psi_cum_test_n = cumulative number of people tested cumulative
psi_quo_test_pos = daily confirmed cases ## ATTENTION this column is empty for the last day
psi_quo_test_inf = negative daily (ATTENTION cet_test_inf) ## daily column is empty for the last day
psi_quo_test_n = cumulative number of people tested daily ## ATTENTION this column is empty for the last day

2.2. Epi vs. PR plots

df <- dfmontr_v1[dfmontr_v1$date > "2020-02-29", ]
df <- df[-nrow(df),] #2/3 is zero

# Cases
df$psi_quo_pos_n <- as.numeric(df$psi_quo_pos_n)
cases <- df$psi_quo_pos_n # + tests
casedf <- data.frame(date = df$date, days = 1:length(cases), cases = cases)

# PR
df$psi_quo_pos_t <- as.numeric(df$psi_quo_pos_t)
pr <- df$psi_quo_pos_t #PR
prdf <- data.frame(date = df$date, days = 1:length(pr), pr = pr)

# Tests
df$psi_quo_tes_n <- as.numeric(df$psi_quo_tes_n)
test <- df$psi_quo_tes_n 
tesdf <- data.frame(date = df$date, days = 1:length(pr), test = test)

# Smoothing with loess()
loepr <- loess(prdf$pr~prdf$days, degree=2, span = 0.01)
fitpr <- predict(loepr, prdf$days)
loecs <- loess(casedf$cases~casedf$days, degree=2, span = 0.01)
fitcs <- predict(loecs, casedf$days)
loetes <- loess(tesdf$test~tesdf$days, degree=2, span = 0.01)
fittes <- predict(loetes, tesdf$days)

#Plots
par(mfrow = c(1,2))
plot(cases, type = "h", col = "pink",
      main = "Number of Positive Tests", xlab = "Days",
     ylab = "+ Tests", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(casedf$days, fitcs/1.5, col = "red", lwd = 1)
plot(pr, type = "h", col = "lightblue",
      main = "Positivity Rate - PR", xlab = "Days",
     cex.main = 0.6, cex.lab=0.6, cex.axis = 0.6)
lines(prdf$days, fitpr/1.5, col = "darkblue", lwd = 1)

par(mfrow = c(1,2))
plot(test, type = "h", col = "lightgreen",
      main = "Number of Tests", xlab = "Days", 
     cex.main = 0.6, cex.lab=0.6, cex.axis = 0.6)
lines(tesdf$days, fittes/1.5, col = "darkgreen", lwd = 1)
plot(cases, type = "h", col = "pink", ylab = "+Tests & PR",
     xlab = "Days", main = "PR and + Tests", 
     cex.main = 0.6, cex.lab=0.6, cex.axis = 0.6)
lines(casedf$days, fitcs/1.5, col = "red", lwd = 1)
lines(prdf$days, fitpr*40, col = "blue", lwd = 1)
legend("topright", legend = c("PR", "+ Tests"),
       lwd = 2, col = c("blue", "red"), cex = 0.4)

It’s clear from these plots that the early numbers until day 100 are not usable due to very low tests numbers. Hence we will look at the epi curve after day 100.

2.3. Naive correlation between Mobility and PR

library(haven)
mobdf <- read_dta("movement-range-data-2021-02-06/montrealmob.dta")[]
mob1 <- mobdf[1:nrow(df),1:2]

workingdf <- data.frame(Date = df$date, day = 1:nrow(df), dpr = prdf$pr, mob1$all_day_bing_tiles_visited_relat)
names(workingdf)[4] <- "mob1"
head(workingdf)
##         Date day   dpr     mob1
## 1 2020-03-01   1  0.00  0.02466
## 2 2020-03-02   2 11.11 -0.05464
## 3 2020-03-03   3  0.00 -0.04480
## 4 2020-03-04   4  0.00 -0.04536
## 5 2020-03-05   5  0.00 -0.01491
## 6 2020-03-06   6  0.00 -0.04394
library(ggplot2)
library(dplyr)
library(patchwork) 
library(hrbrthemes)

pdata <- data.frame(mob = workingdf$mob1, pr = workingdf$dpr,
                    Date = as.Date("2020-03-01") + 1:(nrow(workingdf)))

coeff <- 20

# A few constants
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)

ggplot(pdata, aes(x=Date)) +
  
  geom_bar( aes(y=pr), stat="identity", size=.1, fill=dcolor,
            color="black", alpha=.4) + 
  geom_line( aes(y=mob*coeff), size=0.5, color=prcolor) +
  
  scale_y_continuous(
    
    # Features of the first axis
    name = "Positivity Rates",
    
    # Add a second axis and specify its features
    sec.axis = sec_axis(~./coeff, name="Mobility Changes")
  ) + 
  
  theme_ipsum() +

  theme(
    axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
    axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
    plot.title = element_text(size = 15, face = "bold")
  ) +
 
  ggtitle("Mobility restrictions vs PR - Montreal")

Here is a naive way of looking at the relationship by cross correlations with and without first-differencing:

library(astsa)
#bbb <- ccf(workingdf$mob1, workingdf$dpr, na.action = na.pass)

x <- workingdf$mob1
y <- workingdf$dpr

lag2.plot(x, y, 15)

lag2.plot(diff(x), diff(y), 15)

Cross correlations use the whole data with varying lags up to 21. As it’s clear from the plots, the correlation between mobility and PR is negative for all lags. In other words, as mobility goes down, the spread rises.

2.4. Trainable Dynamic Functional Connectivity

In order to calculate the dynamic nature of this relationship, we develop a nonparameteric approach based on sliding the window correlation (SWC) method, which is a straightforward and common approach for evaluating dynamic functional connectivity. Despite the fact that sliding window analyses have been long used, there are still considerable technical issues associated with the approach.

A great effort has recently been dedicated to investigate the window setting effects on dynamic connectivity estimation. For example, a very recent review on SWC and how it is used in neuroscience is provided by Hutchison et al. (2013). They show that analogous to a moving average function, a sliding window analysis computes a succession of pairwise correlation matrices using the time series from a given parcellation of brain regions. However, despite the growing success of this methodology in neuroimaging, the sliding window technique has multiple parameters such as window function, length and step size that must be set. But the appropriate settings remain unknown due to lack of ground truth, for example, in resting state fMRI data (Mokhtari et al. 2019).

Different than applications of SWC in neuroimaging, we have an unquestionable epidemiological “ground truth” indicating that the spread of an infectious disease must be positively related to the intensity of contacts in a population, which is the core idea in the so-called SIR model). However, the problem of “window-size” is identified in neuroscience: increasing a window length results in decreasing the sensitivity for identifying fast changes with very long windows eventually measuring static connectivity. On the other hand, shorter windows can increase sensitivity for detecting short transition states but at the expense of increasing the spurious fluctuations in the dynamic connectivity Leonardi and Van De Ville, 2015.

Thus, it is essential to determine a window length that allows reducing spurious fluctuations and at the same time capturing faster dynamic correlations. One of the most suggested method to address spurious fluctuations in dynamic functional connectivity is to estimate a method when the window length is not shorter than the largest wavelength present in both series, which is about 7 days in positivity rates and mobility series. This is also verified in the literature as “weekend effect”.

We borrow this method that has been used for understanding dynamic functional connectivity and suggestions in its applications in the neuroscience literature and create a grid search that finds the optimal lag to maximize the positive correlations in 7-day sliding windows.

2.5. Wavelet and Spectral Density in series

Most approaches to finding periodic behavior assume that the underlying series are stationary. For nonstationary series, wavelets have been developed to summarize the variation in frequency composition through time. In our case, both series, PR and mobility, are nonstationary. Wavelets allow us to study localized periodic behavior. In particular, we look for regions of high-power in the frequency-time plot.

library(Rwave)
library(lattice)

x <- as.data.frame(workingdf$dpr)
x <- x[complete.cases(x[,1]),]
x <- x[-c(1:100)]

#from https://ms.mcmaster.ca/~bolker/eeid/2010/Ecology/
mk.cwt = function(thisz, noctave, nvoice, moments=30, smooth=T, smoothsd=1, filtradius=3, do.center=F) {

    ret = cwtTh(thisz, noctave=noctave, 
        nvoice=nvoice, moments=moments, plot=F)
    if (do.center) {
        ret = ret/sum(ret)
    }
    return(list(cwt=ret, noctave=noctave, nvoice=nvoice, moments=moments))
}

plot.cwt = function(tmp, mod=T, ylab='period', xlab='', main='', cex=1.2, center=T) {
    if (mod) {
        tmp$cwt = Mod(tmp$cwt)
    }
    if (center) {
        tmp$cwt = tmp$cwt/sum(tmp$cwt)
    }
    breaks.at = pretty( range(tmp$cwt), n=100)
    my.pal = rev(rainbow(length(breaks.at) , start=0, end=4/6))
    myplot = levelplot(tmp$cwt, pretty=F,
        scales=list(
            y=list(tick.number=tmp$noctave, labels=format(2^(1+seq(from=0,by=tmp$nvoice, 
                                to=tmp$noctave*tmp$nvoice)/tmp$nvoice))), 
            cex=cex
        ), 
        ylab=list(label=ylab, cex=cex),
        xlab=list(label=xlab, cex=cex),
        colorkey=list(labels=list(cex=cex)),
        aspect='fill', 
        main=main,
        col.regions=my.pal, 
        cuts=length(my.pal)-1)
    return(myplot)
}

tmp<-mk.cwt(x,noctave = floor(log2(length(x)))-1,nvoice=10)
plot.cwt(tmp,xlab="time (units of sampling interval)",
         main = "Wavelet of PR after Day 7")

The intensity of the colormap represents the variance of the time series that is associated with particular frequencies (y-axis) through time (x-axis). Our wavelet analysis is able to detect frequencies that are localized in time, and therefore if the dominant period of a time series changes over time, wavelets can be used to detect this transition. The map shows that around days 7 and 8, the \(2^{nd}\) wave shows a dominant variation. We ignore the higher variations around day 100 which captures the increasing variations during the \(1^{th}\) and \(2^{nd}\) waves of the epidemic. This is also captured by the spectral analysis applied on the first-differenced PR and mobility series after Day 100, which indicates the same frequency, 7 days, in both series.

#http://web.stanford.edu/class/earthsys214/notes/series.html
library(zoo)
dspect <- spectrum(diff(x[-c(1:100)]), log="no", spans=c(2,2), plot=FALSE)
specx <- 1/dspect$freq
specy <- 2*dspect$spec
plot(specx, specy, xlab="Period (days)", ylab="Spectral Density", type="l",
     xlim = c(0, 40), main = "Spectral Density of PR",
     col = "blue", lwd = 2)
text(15, 10 , paste0("Maximum denisty at Day ", specx[which.max(specy)]))

library(zoo)
dspect <- spectrum(diff(y[-c(1:100)]), log="no", spans=c(2,2), plot=FALSE)
specx <- 1/dspect$freq
specy <- 2*dspect$spec
plot(specx, specy, xlab="Period (days)", ylab="Spectral Density", type="l",
     xlim = c(0, 40), main = "Spectral Density of Mobility",
     col = "red", lwd = 2)
text(15, 10 , paste0("Maximum denisty at Day ", specx[which.max(specy)]))

2.6. SWC without dynamic lag control

library(dplyr)
library(zoo)

lag = 1:21
w = 7:48
grid <- as.matrix(expand.grid(lag, w))

rol <- c()
for(i in 1:nrow(grid)) {
 montl <- workingdf[2:nrow(workingdf), ] %>%
            mutate(dprL = lead(dpr, grid[i,1]))
 montll <- montl[complete.cases(montl), 4:5]  
 tmp <- rollapply(montll, width=grid[i,2], function(x) cor(x[,1],x[,2]),
                  by.column=FALSE)
 sub <- tmp[100:length(tmp)]
 #rol[i] <- mean(sub[sub > 0])
 rol[i] <- mean(sub)
}

opt <- grid[which(rol == max(rol)), ]

montl <- workingdf[2:nrow(workingdf), ] %>%
            mutate(dprL = lead(dpr, opt[1]))
montll <- montl[complete.cases(montl), 4:5]  
roll <- rollapply(montll, width=opt[2], function(x) cor(x[,1],x[,2]),
                  by.column=FALSE)

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.02)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "lightgrey",
     main = paste0("Correlations with ", opt[1],
                   "-day delay and ", opt[2]/7,"- week sliding window"),
     xlab = "Days", ylab = "Correlation",
     cex.lab=0.8, cex.main = 0.9)
#lines(fitrc[100:length(roll)]/1.3, col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]/13, col = "darkblue", lwd = 1.5)
legend("bottomleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "grey"), cex = 0.7)

We removed the first 100 days, between March 1 and June 8, which is the period when only symptomatic people tested due to few available tests. The dark-blue line shows the smoothed (and scaled) PR numbers and the green line is the smoothed rolling correlations. There are two sections in the graph: between Day 111 (September 27) and 192 (December 17) and the periods before and after this section. Between September 27 and December 17, the correlation is always positive without cyclical positive and negative swings. Before September 27, the spread is subsided with very low PR numbers. During the second wave with increasing daily PR numbers, the correlation between mobility and PR becomes positive, stable, and substantial. Although conventional community spreads take about 14 days to develop symptoms, our grid search captures the most positive relationship between mobility and PR at Lag 2. This could be the result of an efficient contact tracing: when somebody had a positive test results, all his/her contacts are called for testing, which reduces the standard 14-day time in community spread.

But, why does it turn to be a negative correlation after December 17? That’s the point where the daily positivity rates spike very high. One reason would be a change in the delay between mobility and PR due to perhaps an increasing community spread that exceeds the earlier period, and therefore, the contact tracing would not anymore be so efficient. We will have another algorithm that targets the section after December 17 to see if any increasing delay would make that part positive:

library(dplyr)
library(zoo)

lag = 1:21
w = 7
grid <- as.matrix(expand.grid(lag, w))

rol <- c()
for(i in 1:nrow(grid)) {
 montl <- workingdf[2:nrow(workingdf), ] %>%
            mutate(dprL = lead(dpr, grid[i,1]))
 montll <- montl[complete.cases(montl), 4:5]  
 tmp <- rollapply(montll, width=grid[i,2], function(x) cor(x[,1],x[,2]),
                  by.column=FALSE)
 sub <- tmp[292:length(tmp)]
 rol[i] <- mean(sub)
}

opt <- grid[which(rol == max(rol)), ]

montl <- workingdf[2:nrow(workingdf), ] %>%
            mutate(dprL = lead(dpr, opt[1]))
montll <- montl[complete.cases(montl), 4:5]  
roll <- rollapply(montll, width=opt[2], function(x) cor(x[,1],x[,2]),
                  by.column=FALSE)

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.05)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[292:length(roll)], type = "h", lwd = 2, col = "red",
     main = paste0("Rolling correlation with ", opt[1],
                   "-day delay and ", opt[2]/7,"-week window"),
     xlab = "Days", ylab = "Correlation",
     ylim = c(-0.2, 1.1), cex.main = 1)
lines(fitrc[292:length(roll)], col = "green", lwd = 2)
lines(fitpr[292:length(roll)]/12, col = "darkblue", lwd = 2)
legend("bottomleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "green"), cex = 0.75)

Now, as it’s clear from this graph, which magnifies the period after December 17, the maximum positive correlation can be obtained by Lag 11. This verifies the fact that a common lag should not be applied to the whole epidemiological curve. Instead, the delay between mobility and PR has to be tuned in each rolling window separately to see which day(s) in the past would have the maximum positive impact. This is what we will do now.

2.7. Intesity of dynamic relationships

library(dplyr)
library(zoo)
library(forecast)

lag = 1:21
w = 7

mco <- c()
lagg <- c()

mdf <- workingdf[2:nrow(workingdf), ]
n <- nrow(mdf)-w 

unrot <- vector(mode = "list", length = n)

for(j in 1:n){
  co <-c()
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), 4:5]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    co[i] <- cor(mdf3[,1], mdf3[,2])
    unrot[[j]][[i]] <- cbind(ndiffs(mdf3[,1]), ndiffs(mdf3[,2]))
  }
  mco[j] <- max(co, na.rm = TRUE)
  lagg[j] <- lag[which.max(co)]
}

roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.02)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "pink",
     main = "Rolling correlations with tuned delay and 7-day windows",
     xlab = "Days", ylab = "Correlation",
     ylim = c(0, 1.1), cex.main = 1)
lines(fitrc[100:length(roll)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]/12, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
       lty = "dotted", col = "red")
text(50, 0.4 , paste0("mean corr = ", round(mean(roll[100:length(roll)]), 4)))
legend("topleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

library(ggplot2)
library(dplyr)
library(patchwork) 
library(hrbrthemes)

pdata <- data.frame(roll = roll[100:length(roll)],
                    Date = as.Date("2020-06-08") + 0:(length(roll)-100),
                    "pr" = fitpr[100:length(roll)])

coeff <-10.5
# A few constants
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)

ggplot(pdata, aes(x=Date)) +
  
  geom_bar( aes(y=roll), stat="identity", size=.1, fill=dcolor,
            color="black", alpha=.4) + 
  geom_line( aes(y=pr/coeff), size=1, color=prcolor) +
  
  scale_y_continuous(
    
    # Features of the first axis
    name = "Correaltion", limits = c(0,1.3),
    
    # Add a second axis and specify its features
    sec.axis = sec_axis(~.*coeff, name= "Positivity Rate")
  ) + 
  
  theme_ipsum() +

  theme(
    axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
    axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
    plot.title = element_text(size = 15, face = "bold")
  ) +
  
  geom_abline(intercept = mean(pdata$roll), slope = 0, linetype=3,
              color = "red") +
  annotate("text", col = "darkred", x= as.Date("2020-07-20"), y=0.85,
           label= paste0("Average Correlation = ",
                         round(mean(pdata$roll),2))) +

  ggtitle("Time-varying relationship between Mobility & PR", subtitle = "Montreal (Second Wave)")

As explained before, we ignore the first 100 days due to misleading PR values. The graph above shows the nonparametric estimation of rolling correlations (red) between mobility and PR. The average is very high around 77%. We will compare this correlation with Toronto and NYC later to see how effective the similar policies in two different big cities.

2.8. Robustness

When time series data are used, the cross-correlation can be impacted by a possible dependence within-series. Although our use of very short window spans and that we only calculate empirical correlations, in many cases the within-series dependence should be removed first. Otherwise, the spurious correlation would be a problem where series may appear correlated, but the correlation itself would be meaningless. We test the mobility and PR series in every rolling window at the optimal lag corresponding to that window to see the presence of nonstationarity. Although very short intervals make the reliability of available tests questionable, our results show that few windows show nonstationarity.

unrt <- matrix(0, length(lagg), 2)
for (i in 1:length(lagg)) {
  unrt[i, ] <- unrot[[i]][[lagg[i]]]
}
results <- data.frame(Window = 1:length(lagg), Lag = lagg, UR4mob = unrt[,1], UR4PR = unrt[,2])
head(results)
##   Window Lag UR4mob UR4PR
## 1      1   2      0     1
## 2      2   2      0     0
## 3      3   2      0     0
## 4      4   2      0     0
## 5      5   2      0     0
## 6      6   2      0     0

To address this possible problem, we take daily growth rates of both series. The results presented here show that the correlations we calculated earlier are not spurious.

library(dplyr)
library(zoo)
library(forecast)

lag = 1:21
w = 7

mco <- c()
lagg <- c()

mdf <- workingdf[9:nrow(workingdf), ]
mdf$dpr <- diff(mdf$dpr)/mdf$dpr
mdf$mob1 <- diff(mdf$mob1)/mdf$mob1
n <- nrow(mdf)-w 

for(j in 1:n){
  co <-c()
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), 4:5]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    co[i] <- cor(mdf3[,1], mdf3[,2])
  }
  mco[j] <- max(co, na.rm = TRUE)
  lagg[j] <- lag[which.max(co)]
}

rollg <- mco[-length(mco)]
lagag <- lagg[-length(lagg)]

loerc <- loess(rollg~c(1:length(rollg)), degree=2, span = 0.02)
fitrc <- predict(loerc, 1:length(rollg))

plot(roll[100:length(rollg)], type = "h", lwd = 2, col = "pink",
     main = "Using growth in series with tuned delay and 7-day windows",
     xlab = "Days", ylab = "Correlation",
     ylim = c(0, 1.1), cex.main = 1)
lines(fitrc[100:length(rollg)], col = "red", lwd = 1.5)
lines(fitpr[100:length(rollg)]/12, col = "darkblue", lwd = 1.5)
abline(a = mean(rollg[100:length(rollg)]), b = 0, type = "n",  lty = "dotted",
       col = "red")
text(50, 0.5 , paste0("mean corr = ", round(mean(rollg[100:length(rollg)]), 4)))
legend("topleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

2.9. Dynamics of the relationship

Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.

loel <- loess(laga~c(1:length(laga)), degree=2, span = 0.05)
fitl <- predict(loel, 1:length(laga))

plot(laga[100:length(laga)], type = "h", lwd = 2, col = "lightgrey",
     main = "Delays in the effect of mobility restrictions",
     xlab = "Days", ylab = "Mobility Delays - Day",
     ylim = c(1, 21), cex.main = 1)
lines(fitl[100:length(laga)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*1.6, col = "darkblue", lwd = 1.5)
abline(a = mean(laga[100:length(laga)]), b = 0,  lty = "dotted")
text(50, 5 , paste0("Average Delay = ", round(mean(laga[100:length(laga)]),2), " days"))
legend("topleft", legend = c("PR", "Delay"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

library(ggplot2)
library(dplyr)
library(patchwork) 
library(hrbrthemes)

pdata <- data.frame(roll = roll[100:length(roll)], laga = laga[100:length(laga)],
                    Date = as.Date("2020-06-08") + 0:(length(roll)-100), "pr" = fitpr[100:length(roll)])

coeff <- 1.6

# A few constants
dcolor <- "#69b3a2"
prcolor <- rgb(0.2, 0.6, 0.9, 1)

ggplot(pdata, aes(x=Date)) +
  
  geom_bar( aes(y=laga), stat="identity", size=.1, fill=dcolor,
            color="black", alpha=.4) + 
  geom_line( aes(y=pr*coeff), size=1, color=prcolor) +
  
  scale_y_continuous(
    
    # Features of the first axis
    name = "Delays - Day",
    
    # Add a second axis and specify its features
    sec.axis = sec_axis(~./coeff, name="Positivity Rate")
  ) + 
  
  theme_ipsum() +

  theme(
    axis.title.y = element_text(color = dcolor, size=13, face = "bold"),
    axis.title.y.right = element_text(color = prcolor, size=13, face = "bold"),
    plot.title = element_text(size = 15, face = "bold")
  ) +
  
  geom_abline(intercept = mean(pdata$laga), slope = 0, linetype=3,
              color = "red") +
  annotate("text", x= as.Date("2020-07-20"), y=11,
           label= paste0("Mean Delay = ",
                         round(mean(pdata$laga),2),
                         " days")) +

  ggtitle("Delays in the effect of mobility restrictions - Montreal")

We can think that more immediate effects of mobility on PR happen when the effectiveness of contact tracing is high. When there is an increase in community spread without well identified sources, the effect of mobility on PR stretches back to the upper bound of the COVID-19 incubation period, which is estimated to be 14 days, or beyond. To understand how the time for observing the maximum effect of mobility restrictions varies and could be less or more than 14 days, we can look at a benchmark case, where there is no contact tracing and no mobility restrictions.

In this case, we assume that the incubation period for each person is 14 days and that they could be tested and get the result without any delay at the onset of symptoms. There is one infected person (John) and assume that John will be in contact with only 4 people at the same time during the incubation period (14 days). When there are no mobility restrictions, the total 4 new positive tests will be recorded at the onset of their symptoms after 14 days of incubation period. Now let’s reduce the mobility by 50%. The number of new positive test will fall from 4 to 2, but it will still take 14 days to see the reduction in positive test numbers from 4 to 2. When we employ a very effective contact tracing in addition to mobility restrictions, as soon as John shows up in the test center, they will call John’s two contacts, say, in two days. hence the effect of reduction in mobility will be observed in positive cases in 2 days as opposed to 14 days.

We can obviously think of an opposite case. When there is no effective contact tracing in place, the effect of restrictions would take more than 14 days if we introduce some logistical imperfections, such as delays in testing the symptomatic people, long gaps between testing and processing and so on.

2.10. Elasticities

The correlation captures the degree of relatedness between PR and mobility but cannot reveal the responsiveness of PR or to degree of which PR changes in response to changes in mobility. This is also called elasticity and can be defined as follows:

\[ \text { x-elasticity of } y: \epsilon=\frac{\partial y / y}{\partial x / x}, \] which is the ratio of the percentage change in one variable to the percentage change in another variable, when the latter variable has a causal influence on the former. It is a useful tool for measuring the sensitivity of one variable to changes in another, causative variable. It also has the advantage of being a unitless ratio, independent of the type of quantities being varied.

A slope coefficient in a single variable regression, \(y_{i}=\alpha+\beta x_{i}+\varepsilon_{i}\), can be expressed as

\[ \begin{aligned} \beta &=\frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}} \\ &=\frac{s_{x, y}}{s_{x}^{2}} \\ &=r\frac{s_{y}}{s_{x}} \end{aligned} \] where \(s_{x,y}\) is a simple covariance between \(x\) and \(y\), \(s^2_{x}\) is the variance of \(x\), \(r\) is the coefficient of correlation between the \(y\) and \(x\), and \(s_{y}\) and \(s_{x}\) are their standard deviations, respectively. Hence, the “mean” elasticity, \(\epsilon\) may be written with \(y= PR\) (Positivity Rate) and \(x=R\) (Restrictions), as follows

\[ \epsilon=\frac{\partial PR / PR}{\partial R / R} = r\frac{s_{pr}}{s_{r}}\frac{\bar{R}}{\bar{PR}} \]

It follows, therefore, that when \(r\) is in the neighborhood of 1, the spread will be more sensitive or less (i.e. \(\epsilon \lessgtr1\)) according as the spread of COVID-19 is more or less variable than the mobility (\(\frac{s_{PR}}{s_{R}}\)) and the magnitude of restrictions relative to how widespread PR is (\(\frac{\bar{x}}{\bar{y}}\)). This could be better seen in a simulations:

set.seed(1234567)
x1<-c(1,3,5,7,9)
x2 <- c(2,4,6,8,10)
y1 <- 3*x1 + rnorm(5,0,1)
y2 <- 10 + 6*x2 + rnorm(5,0,10)

plot(x2,y2, col="red", pch=16, xlim=c(0,10), ylim=c(0,80), xlab="x", ylab="y",
     main = "Higher correlation lower slope and elasticity")
points(x1,y1, col="blue", pch=16)
abline(lm(y1~x1), col="blue")
abline(lm(y2~x2), col="red")

text(2,40,paste("Correlation =",round(cor(x2,y2), digits = 3)))
text(6,5,paste("Correlation =",round(cor(x1,y1), digits = 3)))
text(2,45,paste("Slope =",round(coef(summary(lm(y2~x2)))["x2", "Estimate"], digits = 3)))
text(6,10,paste("Slope =",round(coef(summary(lm(y1~x1)))["x1", "Estimate"], digits = 3)))

Here are the the responsiveness of positivity rates to mobility changes where the intensity of their relationship (correlations) is at the maximum.

library(dplyr)
library(zoo)

lag = 1:21
w = 7

mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_mtr <- c()
x <- c()

mdf <- workingdf[2:nrow(workingdf), ]
n <- nrow(mdf)-w 

for(j in 1:n){
  cor <-c()
  sdd<- c()
  xyy <- c()
  xx <- c()
  el <- c()
  
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(dpr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), 4:5]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    cor[i] <- cor(mdf3[,1], mdf3[,2])
    sdd[i] <- sd(mdf3[,2])/sd(mdf3[,1])
    xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,2])
    xx[i] <- abs(mean(mdf3[,1]))
    el[i] <- cor[i]*sd[i]*xy[i]
  }
  mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
  sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
  xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
  x[j] <- xx[which(cor == max(cor, na.rm = TRUE))]
  
  mel[j] <- mcor[j]*sd[j]*xy[j]  
  beta_mtr[j] <- mcor[j]*sd[j]
}

saveRDS(beta_mtr, file = "beta_mtr.RDS") 
roll <- mel[-length(mel)]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.01)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "lightpink",
     main = "Responsiveness (Elasticity) of PR to Mobility",
     xlab = "Days", ylab = "Elasticity",
     ylim = c(0, 0.7), cex.main = 1)
lines(fitrc[100:length(roll)]*0.76, col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*0.055, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
        lty = "dotted", col = "red")
text(50, 0.5 , paste0("mean elasticity = ",
                      round(mean(roll[100:length(roll)]), 4)))
text(72, 0.4 , paste0("mean elasticity = ",
                      round(mean(roll[200:length(roll)]), 4), " after Day 200"))
legend("topleft", legend = c("PR", "Elas."),
       lwd = 2, col = c("blue", "red"), cex = 0.6)

This mean elasticity, 0.23, shows that PR is not sensitive to mobility changes on average when we consider the the whole period after Day 100. At the beginning of the \(2^{nd}\) wave this changes and the mean elasticity becomes 0.34, which is still low. For example, when the mobility goes down 10%, PR falls, on average, 3.4% during the period of \(2^{nd}\) wave. We will see in following sections that, although Toronto and New York City have lower correlation coefficients around 0.70, their elasticities are higher about 0.55 and 0.72, respectively. In other words, a 10-percent decline in mobility reduces PR 7.2%. That means that there are other things affecting the positivity rates that might not be captured by the mobility. This could be the implementations of restrictions that are different in each city or other measures such as differences in enforcement of mandatory masks, school closures, and so on.

Furthermore, it is also possible that Montreal may be at a point where mobility restrictions may not be as effective as in NYC and Toronto. In other words, the Montreal’s lower elasticity may indicate that the level of mobility restrictions could be inadequate relative to the level and the speed of the spread. This could be seen if take the ratio of mean of mobility to PR and plot it against the elasticity:

fck <- data.frame(roll = roll, xy = xy[-length(xy)])
fcor <- fck[order(fck[,2]), ]

loefc <- loess(roll~xy, degree=2, span = 0.4, data = fcor)
fitrc <- predict(loefc, fcor$xy)

plot(fcor$xy, fcor$roll, type = "h", col = "lightgrey", ylab = "Elasticity",
     xlab = "Mean of Mobility/PR", main = "Restrictions are more effective when the ratio of mobility to PR is high",
     cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(fcor$xy, fitrc, col = "red", lwd = 2)

The mean of mobility-PR ratio increases when mobility restrictions rise up more than PR (the mobility is taken as an absolute value in elasticity calculations). The plot shows that the elasticity (the responsiveness of PR) goes up as the ratio rises until about 0.03. When the ratio exceeds this point, however, the effectiveness of mobility restrictions dies slowly and stays almost constant. Beyond this point, the increase in the ratio would be mostly due to a decline in PR rather than increase in restrictions. The smoothed elasticity line can shift up, when the mobility-PR ratio is drastically higher, which explains the differences across cities. In Section 5, we investigate this issue: had Montreal had the same ratio as what NYC has, it would have been much higher elasticity than what NYC has.

fck <- data.frame(roll = roll, x = x[-length(x)])
fcor <- fck[order(fck[,2]), ]

loefc <- loess(roll~x, degree=2, span = 0.2, data = fcor)
fitrc <- predict(loefc, fcor$x)

plot(fcor$x, fcor$roll, type = "h", col = "lightgrey", ylab = "Elasticity",
     xlab = "Mean of Mobility", main = "Responsiveness of PR declines with more mobility restrictions ",
     cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(fcor$x, fitrc, col = "blue", lwd = 2)

This plot shows that the elasticity is almost constant when the mean of mobility is between 0.1 and 0.45, which indicates that the effectiveness of restrictions does lose its power in reducing the PR. In other words, the level decline in mobility does not create the same level increase in its effectiveness. Among many other reasons, this could happen when the decline in mobility is not proportional to the rise in PR to have the strong effect on the spread.

3. Toronto

3.1. Data

The number of tests, percent of positive results by age and location and time taken for test processing are obtained from https://covid-19.ontario.ca/data/testing-volumes-and-results#byPHU. The total test numbers reflect 7-day averages of how many tests were processed by the lab in each day. The available information also starts (as of 02-04-2021) from May 1. 2021. Here is the original 7-day average data:

library (Matrix)
library (matrixcalc)
library(readr)

rm(list = ls())

df <- data.frame(read_csv("~/Dropbox/RNS2/DataON/Rinda/Toronto_test_data.csv"))
test <- matrix(df[,2])
pr <- matrix(df[,3])

# Smoothing with loess()
t = 1:266
loepr <- loess(pr~t, degree=2, span = 0.02)
fitpr <- predict(loepr, t)
loecs <- loess(test~t, degree=2, span = 0.02)
fitcs <- predict(loecs, t)

#Plots
par(mfrow = c(1,2))
plot(test, type = "h", col = "pink", main = "Tests performed (7-day average)",
     xlab = "Days", cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(t, fitcs/1.2, col = "red", lwd = 1)
plot(pr, type = "h", col = "lightblue", main = "7-day average PR",
     xlab = "Days", cex.main = 0.8, cex.lab=0.7, cex.axis = 0.7)
lines(t, fitpr/1.2, col = "darkblue", lwd = 1)

In order to extract the daily tests, positivity rate (PR), and positive results (dtest, dpr, and dpos, respectively) from 7-day moving averages, we applied a simple linear algebra:

A <- as.matrix(band(matrix(1, nrow=272, ncol=272), -3, 3)[-c(1:3, 270:272),])
Ai <- svd.inverse (A)
mytest <- 7 * Ai %*% test
mypr <- 7 * Ai %*% pr

dailytest <- ceiling(mytest[-c(1:6), ])
dailypr <- mypr[-c(1:6), ]
dailypr <- ifelse(dailypr < 0, 0.01, dailypr)  

dff <- cbind(df, dtest = dailytest, dpr = dailypr)
dff$dpos <- round(dff$dtest*dff$dpr/100, 0)

head(dff, 10)
##           Date Total.tests.processed Positive.results dtest       dpr dpos
## 1  May 01 2020                  2442              9.5  2805 12.539700  352
## 2  May 02 2020                  2450              9.4  3524  6.654067  234
## 3  May 03 2020                  2460              9.6  3437  8.538682  293
## 4  May 04 2020                  2423              9.7  1942  8.413041  163
## 5  May 05 2020                  2408              9.6  1607  9.938682  160
## 6  May 06 2020                  2561              8.8  2192  7.102785  156
## 7  May 07 2020                  2575              8.9  2522  9.113041  230
## 8  May 08 2020                  2616              8.7  3092 11.139700  344
## 9  May 09 2020                  2664              8.5  3860  5.254067  203
## 10 May 10 2020                  2671              8.3  3486  7.138682  249
saveRDS(dff, file = "trdata.RDS") 

Here are the same graphs with daily numbers:

test <- dff[,4]
pr <- dff[,5]

# Smoothing with loess()
t = 1:266
loepr <- loess(pr~t, degree=2, span = 0.02)
fitpr <- predict(loepr, t)
loecs <- loess(test~t, degree=2, span = 0.02)
fitcs <- predict(loecs, t)

#Plots
par(mfrow = c(1,2))
plot(test, type = "h", col = "pink", main = "Daily tests performed",
     xlab = "Days", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(t, fitcs/1.5, col = "darkgreen", lwd = 1)
plot(pr, type = "h", col = "lightblue", main = "Daily PR",
     xlab = "Days", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(t, fitpr/1.5, col = "darkgreen", lwd = 1)

While it was not obvious when the data smoothed with a 7-day moving average, the daily test data now show clear drops in Victoria Day (May 18), Thanksgiving (Oct 12), Christmas (Dec 25), and New Year eve (Jan 1, 2021). We know that lower test numbers could translate into lower case numbers. This sort of bias in the case numbers are well reported in the literature. That’s why using PR is important. On the other hand, the positive test results calculated by (tests x PR) are different than officially reported case numbers for Toronto. This could be resulted by false positives.

3.2. Mobility and PR

# Adding mobility data
TorontoMob <- read_dta("~/Dropbox/RNS2/DataON/Rinda/TorontoMob.dta")
tordf <- data.frame(pr = dff$dpr, mob = TorontoMob$all_day_bing_tiles_visited_relat[62:327])
saveRDS(tordf, file = "tordf.RDS") 

# Rolling correlation
lag = 1:21
w = 7

mco <- c()
lagg <- c()

mdf <- tordf
n <- nrow(mdf)-w 

rol <- c()

for(j in 1:n){
  co <-c()
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), ]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    co[i] <- cor(mdf3[,2], mdf3[,3])
  }
  mco[j] <- max(co, na.rm = TRUE)
  lagg[j] <- lag[which.max(co)]
}

roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.05)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "pink",
     main = "Rolling correlation with tuned delay and 7-week window",
     xlab = "Days", ylab = "Correlation",
     ylim = c(0, 1.1), cex.main = 1)
lines(fitrc[100:length(roll)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]/12, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
       lty = "dotted", col = "red")
text(22, 0.5 , paste0("mean corr = ", round(mean(roll[100:length(roll)]),4)))
legend("topleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.

loel <- loess(laga~c(1:length(laga)), degree=2, span = 0.05)
fitl <- predict(loel, 1:length(laga))

plot(laga[100:length(laga)], type = "h", lwd = 2, col = "lightblue",
     main = "Average delay in the effect of mobility restrictions on the spread",
     xlab = "Days", ylab = "Mobility Delays - Day",
     ylim = c(1, 21), cex.main = 1)
lines(fitl[100:length(laga)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*1.6, col = "darkblue", lwd = 1.5)
abline(a = mean(laga[100:length(laga)]), b = 0, lty = "dotted", col = "blue")
text(19, 11 , paste0("mean = ", round(mean(laga[100:length(laga)]),2),
                     " days"))
legend("topleft", legend = c("PR", "Delay"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

3.3. Elasticity

library(dplyr)
library(zoo)

lag = 1:21
w = 7

mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_tr <- c()

mdf <- tordf
n <- nrow(mdf)-w 

for(j in 1:n){
  cor <-c()
  sdd<- c()
  xyy <- c()
  el <- c()
  
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), ]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    cor[i] <- cor(mdf3[,2], mdf3[,3])
    sdd[i] <- sd(mdf3[,3])/sd(mdf3[,2])
    xyy[i] <- abs(mean(mdf3[,2]))/mean(mdf3[,3])
    el[i] <- cor[i]*sd[i]*xy[i]
  }
  mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
  sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
  xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
  mel[j] <- mcor[j]*sd[j]*xy[j]
  beta_tr[j] <- mcor[j]*sd[j]
}

roll <- mel[-length(mel)]
saveRDS(beta_tr, file = "beta_tr.RDS") 

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.01)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "pink",
     main = "Responsiveness (Elasticity) of PR to Mobility",
     xlab = "Days", ylab = "Elasticity",
     ylim = c(0, 2), cex.main = 1)
lines(fitrc[100:length(roll)]*0.6, col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*0.15, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
       lty = "dotted", col = "red")
text(30, 1.5 , paste0("mean elasticity = ",
                      round(mean(roll[100:length(roll)]), 4)))
text(45, 1.3 , paste0("mean elasticity = ",
                      round(mean(roll[200:length(roll)]), 4), " after Day 200"))
legend("topleft", legend = c("PR", "Elas."),
       lwd = 2, col = c("blue", "red"), cex = 0.6)

The results are interesting. There is a long period that the responsiveness is less than the average. However, a simple visual inspection shows that there are multiple sections where the high elasticity overlaps with low positivity rates.

When we compare both cities, this shows that, although the intensity of the relationship is almost the same in both cities, the responsiveness of PR to mobility is much higher in Toronto than in Montreal due to a higher ratio of PR and mobility variations and/or a higher ratio of their averages in Toronto relative to in Montreal. This means that mobility restrictions have much more impact on the COVID-19 spread in Toronto than Montreal.

4. New York City

4.1 PR and Mobility

rm(list = ls())

NYCmob <- read_dta("~/Dropbox/RNS2/Quebec/movement-range-data-2021-02-06/NYCmob.dta")
NYCtesting <- read_csv("~/Dropbox/RNS2/Quebec/New_York_State_Statewide_COVID-19_Testing.csv")
nyc <- data.frame(Date = NYCmob$ds,
                  mob = NYCmob$all_day_bing_tiles_visited_relat,
                  test = NYCtesting$`Total Number of Tests Performed`[1:343],
                  pt = NYCtesting$`New Positives`[1:343])
nyc$pr <- (nyc$pt/nyc$test)*100

test <- nyc$test
pr <- nyc$pr

# Smoothing with loess()
t = 1:343
loepr <- loess(pr~t, degree=2, span = 0.02)
fitpr <- predict(loepr, t)
loecs <- loess(test~t, degree=2, span = 0.02)
fitcs <- predict(loecs, t)

#Plots
par(mfrow = c(1,2))
plot(test, type = "h", col = "pink", main = "Daily tests performed",
     xlab = "Days", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(t, fitcs/1.5, col = "darkgreen", lwd = 1)
plot(pr, type = "h", col = "lightblue", main = "Daily Positives",
     xlab = "Days", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(t, fitpr/1.5, col = "darkgreen", lwd = 1)

par(mfrow = c(1,1))
plot(pr[100:343], type = "h", col = "lightgreen", main = "Daily PR - After Day 100",
     xlab = "Days", cex.main = 0.8, cex.lab=0.6, cex.axis = 0.6)
lines(1:244, fitpr[100:343]/1.3, col = "darkgreen", lwd = 1)

# Rolling correlations
lag = 1:21
w = 7

mco <- c()
lagg <- c()

mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w 
saveRDS(mdf, file = "NYCdf.RDS") 


rol <- c()

for(j in 1:n){
  co <-c()
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), ]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    co[i] <- cor(mdf3[,1], mdf3[,3])
  }
  mco[j] <- max(co, na.rm = TRUE)
  lagg[j] <- lag[which.max(co)]
}

roll <- mco[-length(mco)]
laga <- lagg[-length(lagg)]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.05)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "pink",
     main = "Rolling correlation with tuned delay and 7-week window",
     xlab = "Days", ylab = "Correlation",
     ylim = c(0, 1.1), cex.main = 1)
lines(fitrc[100:length(roll)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*22, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
       lty = "dotted", col = "red")
text(35, 0.5 , paste0("mean corr = ", round(mean(roll[100:length(roll)]),4)))
legend("topleft", legend = c("PR", "Corr"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

Now we look at how many days of delay between mobility and PR in each 7-day rolling window is identified by the algorithm.

loel <- loess(laga~c(1:length(laga)), degree=2, span = 0.05)
fitl <- predict(loel, 1:length(laga))

plot(laga[100:length(laga)], type = "h", lwd = 2, col = "lightblue",
     main = "Average delay in the effect of mobility restrictions on the spread",
     xlab = "Days", ylab = "Mobility Delays - Day",
     ylim = c(1, 21), cex.main = 1)
lines(fitl[100:length(laga)], col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*4.6, col = "darkblue", lwd = 1.5)
abline(a = mean(laga[100:length(laga)]), b = 0,
       lty = "dotted", col = "blue")
text(30, 11 , paste0("mean = ", round(mean(laga[100:length(laga)]),2),
                     " days"))
legend("topleft", legend = c("PR", "Delay"),
       lwd = 2, col = c("blue", "red"), cex = 0.7)

4.2 Elasticity

library(dplyr)
library(zoo)

lag = 1:21
w = 7

mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_nyc <- c()

mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w 

for(j in 1:n){
  cor <-c()
  sdd <- c()
  xyy <- c()
  el <- c()
  
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), ]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    cor[i] <- cor(mdf3[,1], mdf3[,3])
    sdd[i] <- sd(mdf3[,3])/sd(mdf3[,1])
    xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,3])
    el[i] <- cor[i]*sdd[i]*xyy[i]
  }
  mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
  sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
  xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
  mel[j] <- mcor[j]*sd[j]*xy[j]
  beta_nyc[j] <- mcor[j]*sd[j]
}

roll <- mel[-length(mel)]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.01)
fitrc <- predict(loerc, 1:length(roll))

plot(roll[100:length(roll)], type = "h", lwd = 2, col = "pink",
     main = "Responsiveness (Elasticity) of PR to Mobility after Day 100",
     xlab = "Days", ylab = "Elasticity",
     ylim = c(0, 2), cex.main = 1)
lines(fitrc[100:length(roll)]*0.6, col = "red", lwd = 1.5)
lines(fitpr[100:length(roll)]*0.44, col = "darkblue", lwd = 1.5)
abline(a = mean(roll[100:length(roll)]), b = 0, type = "n",
       lty = "dotted", col = "red")
text(57.5, 1.5 , paste0("mean elasticity = ", 
                      round(mean(roll[100:length(roll)]), 4)))
text(80, 1.25 , paste0("mean elasticity = ",
                      round(mean(roll[150:length(roll)]), 4), " after Day 150"))
legend("topleft", legend = c("PR", "Elas."),
       lwd = 2, col = c("blue", "red"), cex = 0.5)

The results show that the average responsiveness of PR to mobility restrictions is much higher in New York than both Montreal and Toronto.

5. Using elasticities for normative-referenced evaluations

Here, we will compare the effectiveness of mobility restrictions by using elasticity measures. We create a hypothetical case where we calculate elasticities for Montreal using the data for NYC. This shows how effective the mobility restrictions would have been if Montreal had had the same PR and mobility measures as NYC during the \(2^{nd}\) wave between September 16, 2020 and January 24, 2021. Here is the formal explanation our counter-factual simulations:

\[ \epsilon_M=\left[\frac{\partial PR /PR}{\partial R / R}\right]^M = 0.77\left[\frac{s_{pr}}{s_{r}}\right]^M \left[\frac{\bar{R}}{\bar{PR}}\right]^M=0.34 \]

\[ \epsilon_{NYC}=\left[\frac{\partial PR /PR}{\partial R / R}\right]^{NYC} = 0.72\left[\frac{s_{pr}}{s_{r}}\right]^{NYC} \left[\frac{\bar{R}}{\bar{PR}}\right]^{NYC}=0.65 \]

\[ \epsilon^{CounterFactual}_M=\left[\frac{\partial PR /PR}{\partial R / R}\right]^M = 0.77\left[\frac{s_{pr}}{s_{r}}\right]^M \left[\frac{\bar{R}}{\bar{PR}}\right]^{NYC}=1.35 \]

The result is shown below:

library(dplyr)
library(zoo)

mtrbeta <- readRDS("beta_mtr.RDS")
trbeta <- readRDS("beta_tr.RDS")

lag = 1:21
w = 7

mel <- c()
sd <- c()
xy <- c()
mcor <- c()
beta_nyc <- c()

mdf <- nyc[-c(1:2), c(2,5)]
n <- nrow(mdf)-w 

for(j in 1:n){
  cor <-c()
  sdd <- c()
  xyy <- c()
  el <- c()
  
  for(i in 1:length(lag)) {
    mdf1 <- mdf %>% mutate(dprL = lead(pr, lag[i]))
    mdf2 <- mdf1[complete.cases(mdf1), ]  
    k <- j-1
    mdf3 <- mdf2[j:(w+k), ]
    cor[i] <- cor(mdf3[,1], mdf3[,3])
    sdd[i] <- sd(mdf3[,3])/sd(mdf3[,1])
    xyy[i] <- abs(mean(mdf3[,1]))/mean(mdf3[,3])
    el[i] <- cor[i]*sdd[i]*xyy[i]
  }
  mcor[j] <- cor[which(cor == max(cor, na.rm = TRUE))]
  sd[j] <- sdd[which(cor == max(cor, na.rm = TRUE))]
  xy[j] <- xyy[which(cor == max(cor, na.rm = TRUE))]
  mel[j] <- mcor[j]*sd[j]*xy[j]
  beta_nyc[j] <- mcor[j]*sd[j]
}

roll <- mel[200:331]
roll_mtr <- mtrbeta[200:327]*xy[200:331]
roll_tr <- trbeta[142:259]*xy[142:259]

loerc <- loess(roll~c(1:length(roll)), degree=2, span = 0.05)
fitrc <- predict(loerc, 1:length(roll))

loerc_mtr <- loess(roll_mtr~c(1:length(roll_mtr)), degree=2, span = 0.05)
fitrc_mtr <- predict(loerc_mtr, 1:length(roll_mtr))

loerc_tr <- loess(roll_tr~c(1:length(roll_tr)), degree=2, span = 0.05)
fitrc_tr <- predict(loerc_tr, 1:length(roll_tr))

plot(roll, type = "h", lwd = 2, col = "pink",
     main = "Responsiveness of Montreal's PR in NYC between Sep. 26 & Jan. 24",
     xlab = "Days", ylab = "Elasticity",
     ylim = c(0,3.6), cex.main = 1)
lines(fitrc, col = "red", lwd = 1.5)
lines(fitrc_mtr, col = "blue", lwd = 1.5)
#lines(fitrc_tr, col = "darkgreen", lwd = 1.5)
abline(a = mean(roll), b = 0, type = "n", col = "red",
       lty = "dotted")
abline(a = mean(roll_mtr), b = 0, type = "n", col = "blue",
       lty = "dotted")
text(35, 0.9 , paste0("mean elasticity = ", 
                      round(mean(roll), 4), " for NYC"))
text(37.5, 1.5 , paste0("mean elasticity = ",
                      round(mean(roll_mtr), 4), " for Montreal"))
#text(35, 4.5 , paste0("mean elasticity = ",
#                      round(mean(roll_tr), 4), " for Toronto"))
legend("topright", legend = c("NYC", "Montreal"),
       lwd = 2, col = c("red", "blue"), cex = 0.5)

In order to have this much jump in the elasticity, two things have to be true:

  • The magnitude of mobility restrictions should be much higher relative to the spread (\(\frac{\bar{R}}{\bar{PR}}\)) in New York than in Montreal;
  • The changes in mobility should have a much lower variation through time relative to the changes in positivity rates (\(\frac{s_{PR}}{s_{r}}\)) in Montreal compared to New York.

The second one indicates that, given that the mobility metrics rather measure the people’s behavioral response to the spread (Herby, 2021), this difference implies a significantly lower public sensitivity to COVID-19 in Montreal than other cities. The first one is much more a policy indicator and consistent with the second one: the average reduction in mobility relative to the spread might not have been enough in Montreal in terms of its magnitude and speed compared to NYC.

6. Concluding Remarks

For the first time in history, we have now multiple mobility metrics at finer spatial and temporal scales. The question is very simple: do we have a significant and meaningful relationship between the transmission of viral infection and the geotemporal fluctuations in mobility? Although it looks like a simple question that can be answered with available data, there are fundamental challenges that make the question a very complex problem:

More coming soon…

In this short work, we intend to show basic difficulties and possible solutions in capturing the underlying relationship between mobility restrictions and the COVID-19 spread. Our results show that:

 


A work by ML-Portal